104 research outputs found

    Tree-size bounded alternation

    The size of an accepting computation tree of an alternating Turing machine (ATM) is introduced as a complexity measure. We present a number of applications of tree-size to the study of more traditional complexity classes. Tree-size on ATMs is shown to closely correspond to time on nondeterministic TMs and on nondeterministic auxiliary pushdown automata. One application of the latter is a useful new characterization of the class of languages log-space-reducible to context-free languages. Surprising relationships with parallel-time complexity are also demonstrated. ATM computations using at most space S(n) and tree-size Z(n) (simultaneously) can be simulated in alternating space S(n) and time S(n) · log Z(n) (simultaneously). Several well-known simulations, e.g., Savitch's theorem, are special cases of this result. It also leads to improved parallel complexity bounds for many problems in terms of both time and number of “processors.” As one example we show that context-free language recognition in time O(log² n) is possible on several parallel models. Further, this bound is achievable with only a polynomial number of processors, in contrast to all previously known sub-linear time CFL recognizers.
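
    As a rough illustration of why Savitch's theorem is a special case of the simulation stated above, the derivation below spells out the steps. The class notation (ASPACE-TREESIZE, ASPACE-TIME) and the O(·) constants are assumptions of this sketch rather than notation taken from the abstract, and the final step relies on the standard inclusion of alternating time in deterministic space due to Chandra, Kozen and Stockmeyer, taken here for S(n) ≥ log n.

```latex
\begin{align*}
% Simulation stated in the abstract (constants absorbed into O(.)):
\mathrm{ASPACE\text{-}TREESIZE}\bigl(S(n),\, Z(n)\bigr)
  &\subseteq \mathrm{ASPACE\text{-}TIME}\bigl(S(n),\, O(S(n)\cdot\log Z(n))\bigr). \\
% A nondeterministic machine is an ATM with only existential states,
% so its accepting computation tree is a single path. With S(n) >= log n
% there are at most 2^{O(S(n))} configurations, hence:
Z(n) &\le 2^{O(S(n))} \quad\Longrightarrow\quad \log Z(n) = O\bigl(S(n)\bigr). \\
% Substituting into the simulation:
\mathrm{NSPACE}\bigl(S(n)\bigr)
  &\subseteq \mathrm{ATIME}\bigl(O(S(n)^2)\bigr). \\
% Alternating time T is contained in deterministic space T:
\mathrm{ATIME}\bigl(O(S(n)^2)\bigr)
  &\subseteq \mathrm{DSPACE}\bigl(O(S(n)^2)\bigr),
  \quad\text{so}\quad
  \mathrm{NSPACE}\bigl(S(n)\bigr) \subseteq \mathrm{DSPACE}\bigl(S(n)^2\bigr).
\end{align*}
```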

    Editors' foreword


    Compression of next-generation sequencing reads aided by highly efficient de novo assembly

    We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based compressor, using a novel de novo assembly algorithm. A probabilistic data structure is used to dramatically reduce the memory required by traditional de Bruijn graph assemblers, allowing millions of reads to be assembled very efficiently. Read sequences are then stored as positions within the assembled contigs. This is combined with statistical compression of read identifiers, quality scores, alignment information, and sequences, effectively collapsing very large datasets to less than 15% of their original size with no loss of information. Availability: Quip is freely available under the BSD license from http://cs.washington.edu/homes/dcjones/quip
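
    The idea of storing read sequences as positions within assembled contigs can be sketched roughly as follows. This is a minimal illustration and not Quip's implementation: exact substring search stands in for the real alignment step, and the names `ReadPosition`, `encode_read`, and `decode_read` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class ReadPosition(NamedTuple):
    """Hypothetical compact record: where a read sits inside a contig."""
    contig_id: int
    offset: int
    length: int


def encode_read(read: str, contigs: List[str]) -> Optional[ReadPosition]:
    """Return the first exact placement of `read` in `contigs`, or None.

    Assumption: exact matching only. A real compressor would also have to
    record mismatches and handle reads that do not align to any contig.
    """
    for contig_id, contig in enumerate(contigs):
        offset = contig.find(read)
        if offset != -1:
            return ReadPosition(contig_id, offset, len(read))
    return None


def decode_read(pos: ReadPosition, contigs: List[str]) -> str:
    """Recover the read sequence from its stored position."""
    contig = contigs[pos.contig_id]
    return contig[pos.offset:pos.offset + pos.length]


# Toy contigs, made up for illustration only.
contigs = ["ACGTACGTTAGC", "GGGCCCAAATTT"]
position = encode_read("CCCAAA", contigs)
```

    Under these assumptions a read is replaced by three small integers, which is where the space saving would come from when many reads overlap the same contig.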

    Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    BACKGROUND: Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. RESULTS: By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3′ UTR structures and correlated expression patterns. CONCLUSIONS: Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function.

    A Computational Pipeline for High-Throughput Discovery of cis-Regulatory Noncoding RNA in Prokaryotes

    Noncoding RNAs (ncRNAs) are important functional RNAs that do not code for proteins. We present a highly efficient computational pipeline for discovering cis-regulatory ncRNA motifs de novo. The pipeline differs from previous methods in that it is structure-oriented, does not require a multiple-sequence alignment as input, and is capable of detecting RNA motifs with low sequence conservation. We also integrate RNA motif prediction with RNA homolog search, which improves the quality of the RNA motifs significantly. Here, we report the results of applying this pipeline to Firmicute bacteria. Our top-ranking motifs include most known Firmicute elements found in the RNA family database (Rfam). Comparing our motif models with Rfam's hand-curated motif models, we achieve high accuracy in both membership prediction and base-pair–level secondary structure prediction (at least 75% average sensitivity and specificity on both tasks). Of the ncRNA candidates not in Rfam, we find compelling evidence that some of them are functional, and analyze several potential ribosomal protein leaders in depth.

    A Marfan syndrome gene expression phenotype in cultured skin fibroblasts

    BACKGROUND: Marfan syndrome (MFS) is a heritable connective tissue disorder caused by mutations in the fibrillin-1 gene. This syndrome constitutes a significant identifiable subtype of aortic aneurysmal disease, accounting for over 5% of ascending and thoracic aortic aneurysms. RESULTS: We used spotted membrane DNA macroarrays to identify genes whose altered expression levels may contribute to the phenotype of the disease. Our analysis of 4132 genes identified a subset with significant expression differences between skin fibroblast cultures from unaffected controls versus cultures from affected individuals with known fibrillin-1 mutations. Subsequently, 10 genes were chosen for validation by quantitative RT-PCR. CONCLUSION: Differential expression of many of the validated genes was associated with MFS samples when an additional group of unaffected and MFS affected subjects were analyzed (p-value < 3 × 10⁻⁶ under the null hypothesis that expression levels in cultured fibroblasts are unaffected by MFS status). An unexpected observation was the range of individual gene expression. In unaffected control subjects, expression ranges exceeding 10 fold were seen in many of the genes selected for qRT-PCR validation. The variation in expression in the MFS affected subjects was even greater.

    A new approach to bias correction in RNA-Seq

    Motivation: Quantification of sequence abundance in RNA-Seq experiments is often confounded by protocol-specific sequence bias. The exact sources of the bias are unknown, but may be influenced by polymerase chain reaction amplification, or differing primer affinities and mixtures, for example. The result is decreased accuracy in many applications, such as de novo gene annotation and transcript quantification.